Progressive ratio mask-based adaptive noise estimation method
Jianqing GAO, Yanhui TU, Feng MA, Zhonghua FU
Journal of Computer Applications, 2023, 43(4): 1303-1308. DOI: 10.11772/j.issn.1001-9081.2022030384

Deep learning-based speech enhancement algorithms typically outperform traditional noise suppression-based speech enhancement algorithms. However, they usually do not work well when there is a mismatch between training data and test data. To address this problem, a novel Progressive Ratio Mask (PRM)-based Adaptive Noise Estimation (PRM-ANE) method was proposed and used as the preprocessing stage of a speech recognition system. The method combined the Improved Minima Controlled Recursive Averaging (IMCRA) algorithm, which has frame-level noise tracking capability, with an utterance-level deep progressive learning algorithm that models the nonlinear interactions between speech and noise. Firstly, a two-Dimensional Convolutional Neural Network (2D-CNN) was adopted to learn PRMs, which increase as the Signal-to-Noise Ratio (SNR) increases. Then, the sentence-level PRMs were combined with the conventional frame-level speech enhancement algorithm to perform speech enhancement. Finally, the enhanced speech based on multi-level information fusion was fed directly into the speech recognition system to improve its performance. Experimental results on the CHiME-4 real test set show that the proposed method achieves a Word Error Rate (WER) of 7.42%, a relative reduction of 51.41% compared with the IMCRA speech enhancement method. These results indicate that the proposed enhancement method can effectively improve the performance of downstream recognition tasks.
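
To make the idea of combining a learned ratio mask with frame-level noise tracking more concrete, the following is a minimal sketch and not the authors' PRM-ANE implementation. It assumes a ratio mask in [0, 1] is already available (for example from a trained network) and uses it in place of the speech presence probability in an MCRA-style recursive noise update. The function names `mask_guided_noise_estimate` and `wiener_like_gain`, the smoothing constant `alpha`, and the gain floor are all hypothetical choices made for illustration.

```python
import numpy as np


def mask_guided_noise_estimate(noisy_power, ratio_mask, alpha=0.85):
    """Recursively track the noise power spectrum, frame by frame.

    noisy_power, ratio_mask: arrays of shape (frames, bins).
    ASSUMPTION: the ratio mask stands in for the speech presence
    probability of an MCRA-style estimator; this is an illustrative
    substitution, not the method described in the paper.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    ratio_mask = np.clip(np.asarray(ratio_mask, dtype=float), 0.0, 1.0)
    noise = np.empty_like(noisy_power)
    # ASSUMPTION: the first frame contains noise only.
    noise[0] = noisy_power[0]
    for t in range(1, noisy_power.shape[0]):
        # Mask near 1 (speech likely): smoothing factor near 1, estimate frozen.
        # Mask near 0 (noise likely): smoothing factor near alpha, estimate updated.
        alpha_t = alpha + (1.0 - alpha) * ratio_mask[t]
        noise[t] = alpha_t * noise[t - 1] + (1.0 - alpha_t) * noisy_power[t]
    return noise


def wiener_like_gain(noisy_power, noise_power, floor=0.1):
    """Spectral-subtraction style gain, clipped to [floor, 1]."""
    noisy_power = np.asarray(noisy_power, dtype=float)
    gain = 1.0 - np.asarray(noise_power, dtype=float) / np.maximum(noisy_power, 1e-12)
    return np.clip(gain, floor, 1.0)
```

In this sketch, time-frequency bins that the mask marks as speech leave the noise estimate unchanged, while bins marked as noise pull the estimate toward the observed power, so an utterance-level mask steers a frame-level tracker.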
